Preserve client ID when using Kubernetes storage #1536
base: master
Conversation
Similar to my comment in #1538 - you're not supposed to modify custom resources directly. @srenatus - should modifying CRs directly be an option? Personally, I think it would be akin to going off and doing random SQL operations against an app's database instead of using the app's intended mechanisms to make such changes.
Hey @venezia - thanks for the review (again)! 😜
I am struggling a bit to see what actual constraints prevent us from striving to improve dex to support a more 'kubernetes-native' workflow. It is great that dex supports CRDs as a config storage mechanism, but it seems like a waste not to support the usual workflow of using CRDs. Technically, beyond this issue, the most dynamic parts of the dex config (clients and connectors) already seem to support the CRD approach: adding/updating/removing clients through the CRD API is fully supported, and the same goes for connectors. The CRD approach also has advantages that the gRPC approach does not offer - for example, clients can be declared alongside the rest of a deployment's manifests, without having to wait for the dex API to become available.

I think, right now, the CRD implementation is just not ideal, as you pointed out in #1538 and others did in #1308 and #1333. I would be happy to contribute further to help improve that integration.
I could see it a few ways, @erwinvaneyk.

On one hand, Kubernetes custom resources are not exactly a stellar storage vehicle: there are no user-defined indexes (I think it's just apiVersion+kind), etcd just stores them as JSON blobs under the hood, listwatch isn't that performant, and Kubernetes aims for "eventually consistent" - no atomic operations or transactions are provided. On the other hand, the Kubernetes API is a great take on things - declare something and controllers work to make it happen.

On a personal note, I think custom resources make a lot of sense when they are used in a way that, in the end, results in Kubernetes primitives (like an etcd operator, for example). In this situation, the end result is not some controller converting our resources into core primitives - we're simply storing application data. Storing such data in Kubernetes is not very different from storing your database in, say, the Windows registry. You can do it, but I'm unsure if it is great. (And to be clear, I do do this, and again this is just my opinion - but I have seen people try to store 500kb resources instead of storing them as files, and that's a lot of data pouring across the wire.)

I think that for better performance you're likely going to want a different storage vehicle, as the documentation suggests - if nothing else, then for increased portability: it's "hard" to move custom resources from one cluster to another, but there are plenty of runbooks on how to replicate and migrate, say, MySQL to another environment.

I think leveraging Kubernetes custom resources is a great way to lower the "investment" users must make in order to try out dex. It works at a certain scale, but I'm unsure you would use this setup for, say, 300k users, for all the aforementioned reasons. The problem I have is with embracing users modifying the custom resources directly rather than going through dex.

To your point about seeding a deployment with data: in the vernacular of Kubernetes and Helm, what would stop us from creating a job or post-install hook that waits until things are working and then configures the application's database (in this case custom resources) using the gRPC endpoint? Wouldn't that be a storage-driver-agnostic approach? And in doing so, we would be able to help users not just of Kubernetes custom resources, but also etcd, SQL, etc.

To be clear, I like Kubernetes and I do many things in Kubernetes, but I also think we should strive for solutions that work across all supported storage vehicles instead of just one. I'm not necessarily against making things kubernetes-native, but at the same time I'm unsure it is a solid idea to support multiple vectors.
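A minimal sketch of what such a post-install hook could run, assuming dex's published Go gRPC client (`github.com/dexidp/dex/api`); the address, client ID, and secret below are illustrative, not from this PR:

```go
package main

import (
	"context"
	"log"

	"github.com/dexidp/dex/api"
	"google.golang.org/grpc"
)

func main() {
	// Illustrative in-cluster address; a real hook would use TLS
	// credentials and retry until dex's gRPC endpoint is reachable.
	conn, err := grpc.Dial("dex.example.svc:5557", grpc.WithInsecure())
	if err != nil {
		log.Fatalf("dial dex: %v", err)
	}
	defer conn.Close()

	client := api.NewDexClient(conn)

	// Seed a client through the front door: this works the same whether
	// dex is backed by CRDs, etcd, or SQL.
	_, err = client.CreateClient(context.Background(), &api.CreateClientReq{
		Client: &api.Client{
			Id:           "example-app",    // hypothetical client ID
			Secret:       "example-secret", // hypothetical secret
			RedirectUris: []string{"http://127.0.0.1:5555/callback"},
			Name:         "Example App",
		},
	})
	if err != nil {
		log.Fatalf("create client: %v", err)
	}
	log.Println("client created")
}
```

A Helm post-install hook Job could simply run this binary, making the seeding step storage-driver-agnostic.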
I completely agree with you, @venezia. I think we are on the same page w.r.t. the pros, cons, patterns, and antipatterns of using CRDs. The CRD + operator 'pattern' is great, and is also the main use case for CRDs. Dex indeed differs in that it uses CRDs less as "desired state" and more as plain configuration storage.

I think the main thing our perspectives differ on is the use case. My use case is just prototyping/development. Right now, what I care about is ease of use, minimizing complexity, and avoiding complex workflows (which seem to be the main reasons why dex supports CRDs at all). For my prototype situation, that outweighs the performance and portability concerns you pointed out (though those are indeed valid concerns at scale!).

I am looking at this from a very pragmatic point of view: dex supports CRD-based storage. What I want is to be able to create and edit a client CRD at start and runtime. Dex, being stateless, supports this already, except for a single, minor issue, which this PR addresses. Fixing this issue does not open a new vector (at least not any more than there already was). I think I made the mistake of emphasizing the workflow/vector in this PR too much. 😉

The actual thing this PR is about is changing the resource name of the client to the client ID, which is more readable and easier to reference. Before this change:

```
$ kubectl get oauth2clients
client/apw11euz23
```

After this change:
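(Hypothetical output - the actual name depends on the registered client; with this change the resource name is simply the raw client ID:)

```
$ kubectl get oauth2clients
client/example-app
```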
Would this change on its own be agreeable to you, @venezia?
What about passwords? Are you suggesting we should have hashes for the IDs of some custom resources (oauth2clients), but not others (passwords)? Won't that inconsistency in resource naming confuse someone even more?

I agree with you that lowering the effort a user has to invest to administrate a dex installation is a great idea - but again, should we not just create a storage-driver-agnostic mechanism to seed the data into the system through the front door (the gRPC API)? I don't think you disagreed with using the gRPC API; you just did not like having to wait for it to become ready - which is understandable, and why I suggested a job/post-install Helm chart hook to do this.

FWIW, I was actually working on a dex administrative website to interact with the gRPC API, because I did want to make changes to dex easy. I initially was going to do things through the custom resources directly, then read that the documentation discourages modifying CRs directly - so we both want similar things, honestly. (Though again, one that works with the gRPC API would work across all storage mechanisms, which might make it a more useful-to-all option.)
It is already inconsistent right now. Only the OAuth2Client and Password CRDs use encoded names; other CRDs (such as Connector) do not encode the name. I personally think this change will improve the situation: with Password there is a clear reason why the name is encoded (the email address is used as the name, and emails contain characters like '@' that are not valid in Kubernetes object names), whereas with OAuth2Client it seems arbitrary.
No, indeed, I don't disagree. But let's try to scope this discussion a bit and focus on the contents of this PR (see the last part of my previous comment). 😁
Unrelated to this PR, that sounds interesting to me. Are you planning to open-source it?
In that case, accept my apologies for being a nuisance, @erwinvaneyk. If we're already inconsistent, no harm done. That said, do you think you could add backward compatibility / a migration? I could be wrong (and have been), but it looks like an existing user might have problems if they upgraded to a release with this code.
No worries at all @venezia - I prefer PRs being properly reviewed and scrutinized. 😉
Yes, we can. We could add a parameter to the Kubernetes storage configuration which, if set, preserves the client IDs as proposed in this PR. I am just waiting a bit for further reviews/feedback on this change in general before spending the time to actually implement that.
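A rough sketch of what that could look like, assuming the Kubernetes storage keeps its settings in a Go config struct; the field and tag names here are hypothetical, not dex's actual code:

```go
package kubernetes

// Config sketches the Kubernetes storage options; only PreserveClientID
// is new, and its name is illustrative.
type Config struct {
	InCluster      bool   `json:"inCluster"`
	KubeConfigFile string `json:"kubeConfigFile"`

	// PreserveClientID, if true, stores OAuth2 clients under their raw
	// client ID instead of an encoded hash, so existing deployments that
	// rely on the hashed names keep working unless they opt in.
	PreserveClientID bool `json:"preserveClientID"`
}
```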
This is more of a proposal.
Currently, when using the Kubernetes CRD storage, all client IDs are encoded. Although this allows for a more extensive character set, it makes the client's resource name unguessable/unconstructable without dex itself. This is cumbersome to work around if you are managing dex through the CRDs rather than through the dex API.
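For illustration, the encoding is along these lines - a hash of the ID, base32-encoded with a Kubernetes-safe lowercase alphabet (a sketch of the idea, not dex's exact implementation):

```go
package main

import (
	"encoding/base32"
	"fmt"
	"hash/fnv"
	"strings"
)

// Lowercase base32 alphabet, since Kubernetes object names must be
// lowercase alphanumerics (plus '-' and '.').
var encoding = base32.NewEncoding("abcdefghijklmnopqrstuvwxyz234567")

// idToName sketches how a client ID becomes a resource name: hash it,
// encode it, strip the padding. The resulting name is opaque, which is
// why you cannot construct it by hand without knowing dex's scheme.
func idToName(id string) string {
	h := fnv.New64()
	h.Write([]byte(id))
	return strings.TrimRight(encoding.EncodeToString(h.Sum(nil)), "=")
}

func main() {
	fmt.Println(idToName("example-app")) // prints an opaque encoded name
}
```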
So, I propose to not encode the client ID, since encoding does not seem to be a necessity for client IDs.
An alternative implementation would be to add an option to the Kubernetes storage config, e.g. "PreserveClientID", that preserves the client ID only when enabled.